A Comparative Study of Support Vector Machines Applied to the Supervised Word Sense Disambiguation Problem in the Medical Domain
Authors
Abstract
We have applied five supervised learning approaches to word sense disambiguation in the medical domain. Our objective is to evaluate Support Vector Machines (SVMs) in comparison with other well-known supervised learning algorithms, including the naïve Bayes classifier, C4.5 decision trees, decision lists, and boosting approaches. Based on these results, we introduce further refinements of these approaches. We use unigram and bigram features selected with different frequency cut-off values and window sizes, along with the log-likelihood test of statistical significance for bigrams. Our results show that, overall, the best SVM model was most accurate in 27 of 60 cases, compared to 22, 14, 10, and 14 for the naïve Bayes, C4.5 decision tree, decision list, and boosting methods, respectively.
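The bigram selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard log-likelihood ratio (G²) computed over a 2x2 contingency table of adjacent word pairs; the function name `bigram_log_likelihood`, the `min_count` cut-off, and the toy token list are hypothetical and are not taken from the paper, which does not state its exact settings here.

```python
import math
from collections import Counter


def bigram_log_likelihood(tokens, min_count=2):
    """Score adjacent-word bigrams with the log-likelihood ratio (G^2).

    Bigrams seen fewer than `min_count` times are dropped first, which
    plays the role of a frequency cut-off (the value is an assumption).
    """
    pairs = list(zip(tokens, tokens[1:]))
    n = len(pairs)                          # total number of bigram slots
    bigrams = Counter(pairs)
    first = Counter(w1 for w1, _ in pairs)   # counts of words in position 1
    second = Counter(w2 for _, w2 in pairs)  # counts of words in position 2

    scores = {}
    for (w1, w2), n11 in bigrams.items():
        if n11 < min_count:
            continue
        # Observed 2x2 table: rows = (w1, not w1), columns = (w2, not w2).
        n12 = first[w1] - n11
        n21 = second[w2] - n11
        n22 = n - n11 - n12 - n21
        rows = (n11 + n12, n21 + n22)
        cols = (n11 + n21, n12 + n22)
        observed = ((n11, n12), (n21, n22))

        g2 = 0.0
        for i in range(2):
            for j in range(2):
                o = observed[i][j]
                if o > 0:  # a zero cell contributes nothing to the sum
                    expected = rows[i] * cols[j] / n
                    g2 += o * math.log(o / expected)
        scores[(w1, w2)] = 2.0 * g2
    return scores


# Toy input, purely illustrative.
example_scores = bigram_log_likelihood(["a", "b", "a", "b", "c", "d"])
```

A bigram would typically be kept as a feature when its score exceeds a chi-square critical value with one degree of freedom (3.841 at p = 0.05); whether the authors used that particular threshold is not stated in the abstract.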
Similar resources
A Comparative Study of an Unsupervised Word Sense Disambiguation Approach
Word sense disambiguation is the problem of selecting a sense for a word from a set of predefined possibilities. This is a significant problem in the biomedical domain where a single word may be used to describe a gene, protein, or abbreviation. In this paper, we evaluate SENSATIONAL, a novel unsupervised WSD technique, in comparison with two popular learning algorithms: support vector machines...
A Comparative Study of Supervised Learning as Applied to Acronym Expansion in Clinical Reports
Electronic medical records (EMR) constitute a valuable resource of patient specific information and are increasingly used for clinical practice and research. Acronyms present a challenge to retrieving information from the EMR because many acronyms are ambiguous with respect to their full form. In this paper we perform a comparative study of supervised acronym disambiguation in a corpus of clini...
Kernel Methods for Word Sense Disambiguation and Acronym Expansion
The scarcity of manually labeled data for supervised machine learning methods presents a significant limitation on their ability to acquire knowledge. The use of kernels in Support Vector Machines (SVMs) provides an excellent mechanism to introduce prior knowledge into the SVM learners, such as by using unlabeled text or existing ontologies as additional knowledge sources. Our aim is to develop...
A Comparative Study of Extreme Learning Machines and Support Vector Machines in Prediction of Sediment Transport in Open Channels
The limiting velocity in open channels to prevent long-term sedimentation is predicted in this paper using a powerful soft computing technique known as Extreme Learning Machines (ELM). The ELM is a single Layer Feed-forward Neural Network (SLFNN) with a high level of training speed. The dimensionless parameter of limiting velocity which is known as the densimetric Froude number (Fr) is predicte...
An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation
In this paper, we evaluate a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data. Our knowledge sources include the part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The learning algorithms evaluated include Support Vector Machines (SVM), Naive Bay...